Co-Regularized Ensemble for Feature Selection

نویسندگان

  • Yahong Han
  • Yi Yang
  • Xiaofang Zhou
چکیده

Supervised feature selection determines feature relevance by evaluating feature’s correlation with the classes. Joint minimization of a classifier’s loss function and an `2,1-norm regularization has been shown to be effective for feature selection. However, the appropriate feature subset learned from different classifiers’ loss function may be different. Less effort has been made on improving the performance of feature selection by the ensemble of different classifiers’ criteria and take advantages of them. Furthermore, for the cases when only a few labeled data per class are available, overfitting would be a potential problem and the performance of each classifier is restrained. In this paper, we add a joint `2,1-norm on multiple feature selection matrices to ensemble different classifiers’ loss function into a joint optimization framework. This added co-regularization term has twofold role in enhancing the effect of regularization for each criterion and uncovering common irrelevant features. The problem of over-fitting can be alleviated and thus the performance of feature selection is improved. Extensive experiment on different data types demonstrates the effectiveness of our algorithm.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

متن کامل

سودمندی رگرسیون‌های تجمیعی و روش‌های انتخاب متغیرهای پیش‌بین بهینه در پیش‌بینی بازده سهام

مقاله حاضر به بررسی سودمندی رگرسیون‌های تجمیعی و روش‌های انتخاب متغیرهای پیش‌بین بهینه (شامل روش مبتنی بر همبستگی و ریلیف) برای پیش‌بینی بازده سهام شرکت‌های پذیرفته شده در بورس اوراق بهادار تهران می‌پردازد. به‌منظور ارزیابی عملکرد رگرسیون تجمیعی، معیارهای ارزیابی (شامل میانگین قدرمطلق درصد خطا، مجذور مربع میانگین خطا و ضریب تعیین) مربوط به پیش‌بینی این روش، با رگرسیون خطی و شبکه‌های عصبی مصنوعی...

متن کامل

Ensemble Feature Weighting Based on Local Learning and Diversity

Recently, besides the performance, the stability (robustness, i.e., the variation in feature selection results due to small changes in the data set) of feature selection is received more attention. Ensemble feature selection where multiple feature selection outputs are combined to yield more robust results without sacrificing the performance is an effective method for stable feature selection. ...

متن کامل

!1-regularized ensemble learning

Methods that use an !1-norm to encourage model sparsity are now widely applied across many disciplines. However, aggregating such sparse models across fits to resampled data remains an open problem. Because resampling approaches have been shown to be of great utility in reducing model variance and improving variable selection, a method able to generate a single sparse solution from multiple fit...

متن کامل

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013